Hopfield networks are a kind of recurrent neural network that model auto-associative memory: the ability to recall a memory from just a partial piece of that memory.
In [1]:
import os
import numbers
import numpy as np
import imageio
import matplotlib
from matplotlib import pyplot as plt
from scipy.misc import imread, imsave
%matplotlib inline
matplotlib.rcParams['figure.figsize'] = (20.0, 10.0)
np.random.seed(3)
Let's load in a meme. I'm partial to 'Deal with it'.
In [2]:
#deal = 2 * np.random.binomial(1,.5,size=(5,5)) - 1
#deal = imread('obama.png', mode="L")
deal = imread('small-deal-with-it-with-text.jpg', mode="L")
print(deal.shape)
deal = deal.astype(int)
In [3]:
np.unique(deal)
Out[3]:
To convert this to a 1 bit image, I convert everything darker than some threshold to black (1), and everything else to white (-1). Experimenting a bit with the particular image of the 'deal with it meme' that I have, a threshold of 80 seemed to work reasonably. The resulting image is still a bit rough around the edges, but it's recognizable.
In [4]:
bvw_threshold = 80
deal[deal <= bvw_threshold] = -1
deal[deal > bvw_threshold] = 1
deal = -deal
deal
Out[4]:
In [5]:
np.unique(deal)
Out[5]:
In [6]:
plt.imshow(deal, cmap='Greys', interpolation='nearest');
Now train the weights. Whereas before we used Hebb's rule, now let's use the Storkey Learning Rule. This rule has a few nice advantages over Hebb's rule: it allows the network to learn more patterns (the 'capacity is n/sqrt(2*ln(n))
where n
is the number of neurons in the network), its basins of attraction (to the stored patterns) are larger, the distribution of basin sizes is more even, and the shapes of the basins are more round. The weights at time v
are:
where
and n
is the number of neurons and $\epsilon$ is a bit (+1 or -1) of the pattern being trained at time v
.
The second term of the rule is basically the Hebbian rule. The third and fourth terms basically account for the net input to neurons j and i using the current weights.
To see the development/testing of the below implementation of the Storkey rule, see 'Hopfield Network of memes--Storkey Learning Rule development'.
In [48]:
def storkey_rule(pattern, old_weights=None):
"""
pattern: 2-dimensional array
old_weights: square array of length pattern.shape[0]*pattern.shape[1]
"""
mem = pattern.flatten()
n = len(mem)
if old_weights is None:
old_weights = np.zeros(shape=(n,n))
hebbian_term = np.outer(mem,mem)
net_inputs = old_weights.dot(mem)
net_inputs = np.tile(net_inputs, (n, 1)) # repeat the net_input vector n times along the rows
# so we now have a matrix
# h_i and h_j should exclude input from i and j from h_ij
h_i = np.diagonal(old_weights) * mem # this obtains the input each neuron receives from itself
h_i = h_i[:, np.newaxis] # turn h_i into a column vector so we can subtract from hij appropriately
h_j = old_weights * mem # element-wise multiply each row of old-weights by mem
np.fill_diagonal(h_j,0) # now replace the diagonal of h_j with 0's; the diagonal of h_j is the
# self-inputs, which are redundant with h_i; np.fill_diagonal modifies inplace
hij = net_inputs - h_i - h_j
post_synaptic = hij * mem
#pre_synaptic = post_synaptic.T
pre_synaptic = hij.T * mem[:, np.newaxis]
new_weights = old_weights + (1./n)*(hebbian_term - pre_synaptic - post_synaptic)
return new_weights
This next cell can take a little while if the image is large. For an image of size 128x128, it takes a minute or two.
In [8]:
deal_weights = storkey_rule(deal, old_weights=None)
deal_weights
Out[8]:
Now start with a noisy version of the image. We'll just flip a certain number of random pixels on each row of the image.
In [9]:
def noisify(pattern, numb_flipped=30):
noisy_pattern = pattern.copy()
for idx, row in enumerate(noisy_pattern):
choices = np.random.choice(range(len(row)), numb_flipped)
noisy_pattern[idx,choices] = -noisy_pattern[idx,choices]
return noisy_pattern
noisy_deal = noisify(pattern=deal)
In [10]:
plt.imshow(noisy_deal, cmap='Greys', interpolation='nearest');
Now we can start with that, and use the weights to update it. We'll update the units asynchronously (one at a time), and keep track of the energy of the network every so often.
In [46]:
def flow(pattern, weights, theta=0, steps = 50000):
pattern_flat = pattern.flatten()
if isinstance(theta, numbers.Number):
thetas = np.zeros(len(pattern_flat)) + theta
for step in range(steps):
unit = np.random.randint(low=0, high=(len(pattern_flat)-1))
unit_weights = weights[unit,:]
net_input = np.dot(unit_weights,pattern_flat)
pattern_flat[unit] = 1 if (net_input > thetas[unit]) else -1
#pattern_flat[unit] = np.sign(net_input)
if (step % 10000) == 0:
energy = -0.5*np.dot(np.dot(pattern_flat.T,weights),pattern_flat) + np.dot(thetas,pattern_flat)
print("Energy at step {:05d} is now {}".format(step,energy))
evolved_pattern = np.reshape(a=pattern_flat, newshape=(pattern.shape[0],pattern.shape[1]))
return evolved_pattern
In [12]:
steps = 50000
theta = 0
noisy_deal_evolved = flow(noisy_deal, deal_weights, theta = theta, steps = steps)
In [13]:
plt.imshow(noisy_deal_evolved, cmap='Greys', interpolation='nearest');
Voila.
The cooler thing about the Hopfield networks is that they can encode multiple patterns (to a limit depending on the training regimen, and the number of units). So let's try another maymay.
I got the next meme from here, and then tweaked its levels in Mac's preview so that it'd translate nicely to a 1 bit (black or white) image.
In [14]:
woah = imread('woah.png', mode="L")
woah = woah.astype(int)
woah[woah >= 1] = 1
woah[woah < 1] = -1
woah = -woah
In [15]:
np.unique(woah)
Out[15]:
In [16]:
plt.imshow(woah, cmap='Greys', interpolation='nearest');
Cool. So now we make some weights for this image. The takes a little bit longer than the Hebbian learning rule when it is dealing with previous, nonzero weights.
In [17]:
average_weights = storkey_rule(woah, old_weights=deal_weights)
Now, let's make a noisy Neil deGrasse Tyson, and have the network try to recover the clean, pristine NGT.
In [38]:
noisy_woah = noisify(pattern=woah, numb_flipped=15)
plt.imshow(noisy_woah, cmap='Greys', interpolation='nearest');
In [19]:
recovered_woah = flow(noisy_woah, average_weights, theta = theta, steps = steps)
plt.imshow(recovered_woah, cmap='Greys', interpolation='nearest');
Now let's doublecheck that the average weights also still work for the 'deal with it' image.
In [20]:
deal_recovered = flow(noisy_deal, average_weights, theta = theta, steps = steps)
plt.imshow(deal_recovered, cmap='Greys', interpolation='nearest');
Sweet. So now we can try something like feeding it a pattern that is halfway between the two patterns -- it should eventually settle into one of them! Who has greater meme strength!??!
In [21]:
deal_with_neil = (woah + deal) / 2
print(np.unique(deal_with_neil))
I could force those 0 values to -1 or 1, but that biases the pattern towards deal and neil, respectively (at least, testing suggested this -- I think because Neil has more black pixels and Deal has more white pixels). So, I'll leave them in. I could probably solve this by randomly setting 0's to 1 or -1, but naw.
In [22]:
#deal_with_neil[deal_with_neil == 0] = -1
#np.unique(deal_with_neil)
In [23]:
plt.imshow(deal_with_neil, cmap='Greys', interpolation='nearest');
In [24]:
recovered_deal_with_neil = flow(deal_with_neil, average_weights, theta = theta, steps = steps)
plt.imshow(recovered_deal_with_neil, cmap='Greys', interpolation='nearest');
Assuming the cells/pixels of 0 were unaltered, if you run that a few times, you'll notice that sometimes it settles on Neil, and sometimes it settles on Deal!!!
Hopfield networks can also settle onto 'spurious patterns' (patterns that the network wasn't trained on). For each stored pattern x
, -x
is a spurious pattern. But also, any linear combination of the of the learned patterns can be a spurious pattern. So let's learn a third pattern, and then see the network stabilize on a simple combination of the three patterns.
In [25]:
butt = imread('dick_butt.png', mode="L")
butt = butt.astype(int)
butt[butt >= 1] = 1
butt[butt < 1] = -1
plt.imshow(butt, cmap='Greys', interpolation='nearest');
In [26]:
average_weights = storkey_rule(butt, old_weights=average_weights)
average_weights
Out[26]:
In [27]:
noisy_butt = noisify(pattern=butt)
plt.imshow(noisy_butt, cmap='Greys', interpolation='nearest');
In [28]:
recovered_butt = flow(noisy_butt, average_weights, theta=theta, steps=steps)
plt.imshow(recovered_butt, cmap='Greys', interpolation='nearest');
In [41]:
plt.imshow(woah, cmap='Greys', interpolation='nearest');
In [47]:
np.random.seed(10)
recovered_woah = flow(woah, average_weights, theta=theta, steps=70000)
plt.imshow(recovered_woah, cmap='Greys', interpolation='nearest');
I don't understand why it settles on that...even if given the 'woah' image it was trained on, it evolves towards that pattern....
Okay, now let's make a spurious pattern. Any linear combination will do.
In [30]:
spurious_meme = butt + deal + woah
np.unique(spurious_meme)
Out[30]:
In [31]:
spurious_meme[spurious_meme > 0] = 1
spurious_meme[spurious_meme < 0] = -1
In [32]:
plt.imshow(spurious_meme, cmap='Greys', interpolation='nearest');
Pretty noisy. Only Neal, and kiiiiinda the Deal with It, are visible. Now make a noisy version of that combination.
In [33]:
noisy_spurious_meme = noisify(pattern=spurious_meme)
plt.imshow(noisy_spurious_meme, cmap='Greys', interpolation='nearest');
Beautifully noisy. Can barely see anything in it. But now if we start with that, and apply the weights, it should recover the spurious pattern!
In [34]:
steps = 100000
recovered_spurious_meme = flow(noisy_spurious_meme, average_weights, theta=theta, steps=steps)
plt.imshow(recovered_spurious_meme, cmap='Greys', interpolation='nearest');
And it sure as heck did.
In [ ]: